procedure known as sequential analysis which measured the error rate of
production lots. The procedure can be used on Test Program Sets (TPS) as well as
production lots and, in fact, is advocated in MIL-STD-2077. It is apparent that
sequential analysis is rarely used, however, in the validation of TPS's. This is
probably due to two reasons: little understanding of the method of sequential
analysis in the TPS community and poor choices of variables in MIL-STD-2077;
namely, a TPS with 2.5% errors is considered to be good, and a TPS is not
considered bad until it has in excess of 13% errors.

This report advocates the use of sequential analysis for validating TPS's, but
uses variables different from MIL-STD-2077. It also describes an embellishment
to the procedure which makes its use more reasonable.
INTRODUCTION


In 1982, an analysis of Test Program Set (TPS) acceptance plans used by the
Department of the Army was performed at ARDC. It indicated that most TPS
acceptance plans call for the insertion of a number of faults in an otherwise
known good Unit Under Test (UUT) to ensure that the test program correctly
identifies these faults. The number of faults to be checked in this manner, the
number of failures acceptable, or the action to be performed in case of a
failure may vary from plan to plan. Often these plans are generated by
intuition.

The plan in general use within ARDC is: (1) ensure that the TPS runs properly
against two good UUT's; (2) ensure that the TPS properly identifies 10 faults
selected from a list of 20 faults selected and tested by the developer; and (3)
ensure that the TPS can find five other faults selected by the Quality Assurance
(QA) representative. This plan was used to accept TPS's developed to support
several weapons systems, and while the plan gave reasonable assurance that it
ensured acceptance of good TPS's and rejected bad ones, actual confidence levels
based on mathematical analysis were elusive. Research then commenced on
application of a standard statistical theory.

This report details the results of some of this research. It includes a
generalized model for developing a TPS acceptance procedure. An interesting
aspect of this plan is that it can be applied to a set of TPS's as well as a
single TPS. This may prove invaluable for testing TPS's of the future which may
be generated by artificial intelligence methods.


APPLICATION OF SEQUENTIAL SAMPLING PLANS FOR TEST PROGRAM SET ACCEPTANCE


TPS acceptance plans in use today call for inserting a specific number of faults
into a UUT and ensuring that the TPS correctly locates these faults. Two
problems are associated with this aspect of a TPS acceptance plan: (1) How many
faults should be inserted to guarantee a certain confidence level that the TPS
is good? (2) What faults should be chosen to perform the acceptance test?

MIL-STD-2077, General Requirements for Test Program Sets, addresses both of
these points, but is generally not closely followed. The test plan described
herein follows the essence of the plan set forth in MIL-STD-2077, but includes
certain changes and some amplification.

The  significant  differences  between  this  plan  and  that  of  MIL-STD-2077  are: 

1. The acceptance levels herein are more stringent. MIL-STD-2077 specifies
2.5% errors as acceptable; this plan specifies 0.1% error as acceptable.

2. Classification of faults according to their criticality is taken into
account in this plan, whereas MIL-STD-2077 classifies faults but does not take
the classification into account.

3. The method of choosing the faults to be inserted is based upon component
failure rate data in both plans, but in this plan modification of the failure
rate data with a stress factor based on an engineering analysis of the circuit
design is advocated.

The foundation of the test plan to be followed is based upon established
prediction methodologies, known as sequential analysis, set down by A. Wald (ref
1) and others during the late 1940's. This procedure will predict acceptance or
rejection of manufactured lots based on an attribute sampling plan. This
sequential sampling plan is an accepted sampling technique utilized by the
business community as well as the armed services.

Sequential analysis is simply "... a method of statistical inference whose
characteristic feature is that the number of observations required by the
procedure is not determined in advance of the experiment" (ref 1).

Some  of  the  variables  associated  with  this  plan  need  to  be  clearly  defined. 

Inspection lot - The TPS is considered to be an inspection lot.

Unit of product - A unit of product is the item inspected to determine its
classification as defective or non-defective. In the TPS, each definable
failure mode is a unit of product.

Acceptable quality level (AQL) - The maximum percent of defective items within
a lot (a TPS) that, for purposes of sampling, can be considered satisfactory to
accept that lot. The value used in this plan is 0.1%.

Lot tolerance percent defective (LTPD) - The lowest quality level that the
customer is willing to accept. The value used in this plan is 10%.

[Note: Since all possible faults cannot be tested, and there will be
probabilities of errors associated with any sampling tests, the developer will
strive for more stringent requirements (AQL) in order to assure that the
customer's requirements (LTPD) are met.]

Producer's  risk  -  The  chance  that  a  good  lot  could  be  rejected. 

Customer's  risk  -  The  chance  that  a  bad  lot  could  be  accepted. 

Sample - Units of product selected from the inspection lot. The number of units
of product is the sample size. For the purposes of this study, these are the
faults to be inserted.

The acceptance sampling plan herein has many similarities with the sampling
plan presented in MIL-STD-2077, both of which are sequential sampling plans
based on Wald's work. We wish to accept a TPS with 0.1% defectives or less
(AQL), and we wish to set a 5% limit to the chance of rejecting a TPS that meets
this standard (producer risk); also we may wish to reject a TPS that has 10% or
more defectives (LTPD), and we wish to set a variable limit (from 1% to 25%) on
the chance of accepting a TPS with this ratio of defectives (customer risk). In
this plan, P1 will equal AQL and P2 will equal LTPD. The AQL and LTPD are terms
used in MIL-STD-2077; P1 and P2 are terms used in the mathematical equations.


Another view of the meanings and interrelations of AQL, LTPD, customer risk,
and producer risk is as follows: a TPS will have a certain error rate, P, which
we want to measure; but, since the number of possible faults in the UUT that the
TPS tests may easily number in the hundreds, and since we cannot afford to check
them all, we want to measure this error rate using sequential analysis. We are
going to set two error rates as boundaries for our test. AQL (P1) is 0.1%. If
the TPS has only one error or less in 1,000, we will consider it acceptable.
LTPD (P2) is 10%. If the TPS has 10 errors or more out of 100, we want to reject
it. We realize that because we are going to test fewer than 1,000 or even 100
faults, there will be some uncertainty as to where our measurement of P will
fall with respect to P1 and P2. We hope we don't accept a TPS whose error rate
is actually at or above LTPD, but this could happen, and we set a limit on the
risk we are willing to take that this does happen. This, the customer risk, we
will choose from 1% to 25%. This means that a TPS with 10% errors has a 1% to
25% chance of being accepted. The producer hopes that we don't reject a TPS
that is actually good. We set the producer risk at 5%. This means that a TPS
that meets the AQL has no more than a 5% chance of being rejected by this plan.

MIL-STD-2077 sets LTPD at 13% and AQL at 2.5%; here we have chosen LTPD to be
set at 10% and AQL at 0.1%; these values are commensurate with modern industrial
standards. Also, MIL-STD-2077 sets the consumer risk and producer risk as equal
but they are based upon mean time between failure of the UUT. Here we have set
the producer risk at 5% and advocate selecting consumer risk based upon mean
time between failure (MTBF) from table 1 which is similar to table I of
MIL-STD-2077.*

For this plan, P1 is the acceptance limit specified by AQL, α is the producer
risk, 1-α is the probability of accepting a TPS that is good, P2 is the LTPD,
and β represents consumer risk from table 1. The relationship between α, β, P1,
and P2 is shown in figure 1.

P1  = 0.1% = 0.001       AQL
P2  = 10% = 0.10         LTPD
α   = 5% = 0.05          Producer risk
1-α = 1-0.05 = 0.95      Acceptance probability
β   = 10% = 0.10         Customer risk


* Table 1 of this report incorporates a more detailed correlation between
consumer risk and UUT MTBF; namely, a linear relation which is derived from two
specification points:

MTBF = 50 hr -> β = 1% and MTBF = 5,000 hr -> β = 25%.
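
The two specification points in the footnote can be turned into a small
interpolation routine. The sketch below is illustrative only: it assumes the
relation is linear in MTBF itself (table 1 is not reproduced here, so this is
not confirmed), and the function name and the clamping outside the 50 to 5,000
hr range are assumptions made for the example.

```python
def consumer_risk_from_mtbf(mtbf_hr: float) -> float:
    """Interpolate consumer risk (beta) linearly between the two footnote points.

    Assumption: the relation is linear in MTBF (hours) and is clamped to the
    end values outside the specified range.
    """
    lo_mtbf, lo_beta = 50.0, 0.01     # MTBF = 50 hr    -> beta = 1%
    hi_mtbf, hi_beta = 5000.0, 0.25   # MTBF = 5,000 hr -> beta = 25%
    m = min(max(mtbf_hr, lo_mtbf), hi_mtbf)
    slope = (hi_beta - lo_beta) / (hi_mtbf - lo_mtbf)
    return lo_beta + slope * (m - lo_mtbf)
```

Under these assumptions a UUT with an MTBF midway between the two points
(2,525 hr) would be assigned a consumer risk of 13%.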
To construct a chart for use in a sequential sampling plan, we need to draw a
graph with a horizontal axis, n, which will represent the number of faults to be
inserted in the UUT to verify the TPS, and a vertical axis which will represent
the score or defects, d, obtained by the test subject (fig. 2). On these axes we
draw the parallel lines:

d = h1 + sn

d = h2 + sn

The variables describing these lines, as derived by Wald, are related to the
parameters P1, P2, α, and β.

To find the Y intercepts h1 and h2 and the common slope s, we first compute
the auxiliary quantities (ref 2):

g1 = LOG[P2/P1]

g2 = LOG[(1 - P1)/(1 - P2)]

a = LOG[(1 - β)/α]

b = LOG[(1 - α)/β]

and then

h1 = -b/(g1 + g2)

h2 = a/(g1 + g2)

s = g2/(g1 + g2)

The values for the above-mentioned conditions are shown in table 2. β, or
consumer's risk, is highlighted, as is the number of faults (n), for easier
reference.
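
As a rough check on the chart parameters, the intercepts and slope can be
computed directly from P1, P2, α, and β. The sketch below is illustrative: the
function name is hypothetical, and natural logarithms are used, which is
harmless because h1, h2, and s are ratios of logarithms and do not depend on
the base.

```python
import math

def wald_plan(p1: float, p2: float, alpha: float, beta: float):
    """Return (h1, h2, s) for the accept line d = h1 + s*n and reject line d = h2 + s*n."""
    g1 = math.log(p2 / p1)
    g2 = math.log((1 - p1) / (1 - p2))
    a = math.log((1 - beta) / alpha)
    b = math.log((1 - alpha) / beta)
    h1 = -b / (g1 + g2)   # intercept of the lower (accept) line, negative
    h2 = a / (g1 + g2)    # intercept of the upper (reject) line, positive
    s = g2 / (g1 + g2)    # common slope
    return h1, h2, s

# Values used in this plan with a 10% consumer risk.
h1, h2, s = wald_plan(p1=0.001, p2=0.10, alpha=0.05, beta=0.10)
print(f"h1 = {h1:.4f}, h2 = {h2:.4f}, s = {s:.5f}")
```

With these inputs the accept intercept is roughly -0.48, the reject intercept
roughly 0.61, and the slope roughly 0.022 defects per fault inserted.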

Once the graph has been drawn, its use is simple; for each fault inserted, the
point (n,d) is plotted. If at any time the point meets or falls below the lower
of the parallel lines, we accept the lot. If it meets or crosses the upper line,
we reject the lot. As long as it remains between the lines, we continue testing.
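
The plotting rule can also be applied numerically instead of on paper. The
following is a minimal sketch, assuming h1, h2, and s have already been
computed; the function names are hypothetical, and the intercepts and slope
used in the usage note are rounded illustrative values for a 10% consumer risk.

```python
def decide(n: int, d: float, h1: float, h2: float, s: float) -> str:
    """Classify the point (n, d) against the two parallel lines."""
    if d >= h2 + s * n:      # meets or crosses the upper line
        return "reject"
    if d <= h1 + s * n:      # meets or falls below the lower line
        return "accept"
    return "continue"

def run_plan(scores, h1: float, h2: float, s: float):
    """Accumulate the defect score fault by fault; stop at the first decision.

    scores[i] is the score added by the (i+1)-th inserted fault (0 if the TPS
    handled it correctly). Returns (verdict, number of faults tested).
    """
    d = 0.0
    for n, score in enumerate(scores, start=1):
        d += score
        verdict = decide(n, d, h1, h2, s)
        if verdict != "continue":
            return verdict, n
    return "continue", len(scores)
```

For example, `run_plan([0] * 30, -0.478, 0.614, 0.0222)` accepts once the
lower line rises to zero, while a single score of one early in the test, as in
`run_plan([0, 0, 1], -0.478, 0.614, 0.0222)`, lands above the upper line.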

At  this  point  we  will  be  diverging  somewhat  from  the  plan  presented  by  Wald 
in  that  we  will  be  using  a  scoring  system  to  rate  the  defects.  We  should  note 
that  the  score  will  be  based  on  faults  inserted  on  a  circuit  board  to  test  the 
TPS. 

The minimum number of faults to determine acceptance can be calculated by
setting d = 0 and solving for n, where n = -h1/s. As can be seen from figure 3
and table 1, this can range from 13 faults for a 25% consumer risk to 44 faults
for a 1% consumer risk. As indicated by Crow et al (ref 3), the test can be


of faults. If we have tested 1.5 times the minimum faults and we are still in
the "continue testing" area with a gradual slope toward acceptance, a decision
should be made to stop testing for economic reasons. At this time if we are
still in the continue testing range, we must make a decision whether to accept
or reject the board based on previous tests.
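
The minimum sample sizes quoted above can be reproduced from the line
equations. Setting d = 0 in d = h1 + sn gives n = -h1/s, which simplifies to
b/g2. The sketch below is illustrative; the function names are hypothetical,
and rounding the 1.5 times stopping point up to a whole fault is an assumption.

```python
import math

def min_faults_to_accept(p1: float, p2: float, alpha: float, beta: float) -> int:
    """Smallest whole n at which a defect score of zero reaches the accept line."""
    g2 = math.log((1 - p1) / (1 - p2))
    b = math.log((1 - alpha) / beta)
    # n = -h1/s = [b/(g1 + g2)] / [g2/(g1 + g2)] = b/g2
    return math.ceil(b / g2)

def economic_stop_point(n_min: int) -> int:
    """1.5 times the minimum number of faults (assumption: rounded up)."""
    return math.ceil(1.5 * n_min)

for beta in (0.25, 0.10, 0.01):
    n_min = min_faults_to_accept(0.001, 0.10, 0.05, beta)
    print(beta, n_min, economic_stop_point(n_min))
```

With P1 = 0.1%, P2 = 10%, and α = 5%, this gives 13 faults at a 25% consumer
risk and 44 faults at a 1% consumer risk, matching the range stated in the
text.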


FAULT  SCORING 


Should  a  fault  occur  (the  TPS  cannot  find  a  fault  or  improperly  finds  a 
fault),  there  must  be  some  consideration  as  to  the  criticality  of  the  fault.  At 
the  verification  and  validation  (V&V)  testing,  there  are  several  types  of  faults 
that  might  show  up  in  a  TPS: 

A critical fault is one which prohibits continuation of sample testing.
Failure of the TPS to pass a performance test is considered a critical defect.
If this defect can be corrected at the time of testing with a minor programming
change (tolerances are too tight to pass a good board or too loose to call out a
defective board), then we will consider the fault a corrected fault and a
penalty will be assigned to that particular fault.

A major fault is a fault other than critical that is likely to result in
failure or reduce materially the usability of the TPS for its intended purpose.
Failure to isolate an inserted fault is considered a major fault. Once again, if
the fault could be corrected during the V&V with a minimum of action on the part
of the programmer, then this would be considered a major fault corrected.

Minor faults are ones that will not materially reduce the ability of the TPS to
be used for its intended purpose. Wording and spelling errors in display
messages or in the test program instructions are considered minor defects.
Again, we could consider this in a minor fault corrected category.

The next question that arises is: Why consider the corrected categories or even
rank them differently? When using the previous sampling plan, two errors in
grammar (commas, for instance) would reject the entire TPS. Although they would
in no way affect the running of the program, without ranking we are forced to
reject a functional TPS on the basis of two errors in grammar! The ranking of
two misspellings with a critical defect is unjustifiable. This sampling plan is
designed to take the seriousness of the fault into account with a scoring system
that we feel fairly considers the seriousness of a fault in the V&V process.

For a critical fault not corrected, we would add two to the defect score on the
graph. This would effectively reject the lot by positioning the point in the
reject area of the graph. A critical fault corrected would add one to the defect
score on the graph, and testing would continue if the point falls between the
two parallel lines. A major fault not corrected would result in a score of one
being added to the graph, and once again testing may or may not continue. A
major fault corrected would add 0.5 to the score total and testing, again, may
or may not continue. Minor faults would add 0.4 or 0.2 to the defect score
depending on whether or not they are corrected. The values of these defect
scores are based upon engineering judgment of the relative weights of their
seriousness.
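
The scoring scheme lends itself to a simple lookup table. The sketch below is
illustrative; the names are hypothetical, and it assumes that of the two minor
fault values, 0.4 applies when the fault is not corrected and 0.2 when it is
(the text gives both values but does not spell out the order).

```python
# (severity, corrected during V&V) -> points added to the defect score d
FAULT_SCORES = {
    ("critical", False): 2.0,
    ("critical", True): 1.0,
    ("major", False): 1.0,
    ("major", True): 0.5,
    ("minor", False): 0.4,   # assumption: uncorrected minor takes the larger value
    ("minor", True): 0.2,    # assumption: corrected minor takes the smaller value
}

def defect_score(observations) -> float:
    """Total score for a sequence of (severity, corrected) observations."""
    return sum(FAULT_SCORES[(severity, corrected)]
               for severity, corrected in observations)
```

Under this weighting, two corrected spelling errors contribute a total of 0.4
to the score rather than two full defects, which is the effect the ranking is
meant to achieve.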


FAULT SELECTION


The next item that should be considered is the placement of the faults we will
insert into the board. It will behoove us to test those faults that will
actually occur to the UUT in the field. Is it possible to locate the built-in
weaknesses of a particular board? The answer to this question, in many cases,
is yes. What is needed at this time is additional data which will allow us to
predict and isolate those components most likely to fail in actual operation.
Those components will be faulted and used to verify the ability of a TPS to
correctly isolate faults on a given circuit board using the procedure outlined
above. These data would probably come from the depot level and would include
data such as board type and serial number, parts called out for repair, parts
actually replaced, errors called out by TPS, last station calibration check,
etc.

In lieu of such data, the prediction of failures can be accomplished by using
generic failure rates and part stress analysis of the components on the board.
These calculations will show the components most likely to fail first. Fault
insertion should center around these items initially and diverge to other areas
of lower reliability during the testing based on this analysis.

To determine the most logical faults to be inserted on the UUT, we must first
determine the relative reliability of each of the components. This can be done
through the individual equations in Military Handbook 217D or the relative
generic reliabilities can be used. If the circuit board is very large, the
calculation of each individual component on the board may not be practical. In
this case, one would group the components into small sets; e.g., resistors,
ceramic capacitors, IC's, electrolytic capacitors, etc. The generic failure
rates are found and recorded. The component group with the highest generic
failure rate should be targeted for the initial testing during V&V. Should
there be two or more groups with the same generic failure rate, the faults
tested should incorporate equal percentages of these parts.

A key feature to examine is the stress on the individual component. A component
may be operating at the edge of its performance limit. Questions evoking
unusual conditions should be asked: What would happen if a spike were to come
down an input pin? Is the circuit connected to a high impedance load, such as a
motor, which will generate considerable back EMF when the unit is cycled through
its various phases? Factors such as these will need to be addressed before the
faults are chosen.

With  the  generic  reliability  data  on  hand  along  with  the  circuit  diagram, 
the  engineer  can  estimate  which  components  are  subjected  to  the  highest  stress 
and  from  there,  which  have  the  highest  failure  rates. 
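
One way to picture this selection step is as a ranking of component groups by
a stress-adjusted failure rate. The sketch below is purely illustrative: the
function name, the component names, and all of the numbers are hypothetical,
and treating the stress factor as a simple multiplier on the generic failure
rate is an assumption rather than a method stated in this report.

```python
def rank_fault_candidates(components):
    """Order candidates from most to least likely to fail.

    components: list of (name, generic_failure_rate, stress_factor) tuples.
    Assumption: adjusted rate = generic failure rate * stress factor.
    """
    return sorted(components, key=lambda c: c[1] * c[2], reverse=True)

# Hypothetical board data for illustration only.
board = [
    ("resistors", 0.02, 1.0),
    ("electrolytic capacitors", 0.05, 1.0),
    ("motor driver IC", 0.04, 2.0),   # judged to be highly stressed by back EMF
]
for name, rate, stress in rank_fault_candidates(board):
    print(f"{name}: adjusted rate {rate * stress:.3f}")
```

Fault insertion would then start from the top of the ranked list and work
downward as testing proceeds.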


IMPLICATIONS OF EXPANDED USE


This acceptance plan will provide the desired results of rejecting only α (5%)
of the time lots containing 0.1% defective items, but will reject at least
1 - β of the time lots containing 10% defective items. With the fault selection
technique described, meaningful V&V testing can be accomplished. This procedure
is valid and can be applied to the testing of TPS's as well as manufactured
lots.
It can also be applied to "lots" of TPS's. In this case, a lot of TPS's would be
defined as a group of TPS's produced and used commonly; for instance, a group of
TPS's procured for a certain weapon system. Of course, one should be convinced
that the TPS's are produced similarly; same programming style guidelines, same
test equipment, perhaps the same programmer. But even one individual programmer
may display various techniques throughout various test programs. However,
consider test programs that are generated by Automatic Test Program Generators
(ATPG), or test programs of the future which may be generated by artificial
intelligence machines. If a weapon system with 120 UUT's had been targeted for
TPS's which would all be generated under a common system, it would be perfectly
valid to test only a portion of the 120 TPS's using a test plan of this nature
and accept or reject the entire lot based on the outcome.

It is recommended that the Army continue the use of this type of sampling plan
to test TPS's and consider its use for accepting or rejecting ATPG or AI
systems for generating TPS's.


Figure 1. Relation of producer and consumer risks, AQL, and LTPD

[Vertical axis: probability of accepting lot. Horizontal axis: proportion
defective in lot, with P1 and P2 marked.]

α - Producer risk

β - Consumer risk

P1 - Acceptable quality level (AQL)

P2 - Lot tolerance percent defective (LTPD)


Score vs. Faults Inserted